Efficient SPARQL Query Processing via Map-Reduce-Merge

Author

Nate Soule

Abstract

The move towards a “semantic web” is driving the need for efficient querying over large datasets consisting of statements about web resources. RDF is a set of standards for describing and modeling data and is the backbone of semantic web technologies. RDF datasets can be very large and are often subject to complex queries intended to extract and infer otherwise unseen connections within the data. Map-Reduce is a framework that simplifies the development of programs for processing large datasets in a distributed, parallel, fault-tolerant fashion. Map-Reduce provides many of the features required to support the type of querying needed in the semantic web, but it has historically lacked a natural way to process joins, a critical component of RDF query processing. This paper presents a set of algorithms to support efficient processing of the core of SPARQL, an RDF query language, over an extension of Map-Reduce. A simple implementation of these algorithms is presented, and preliminary results are documented.
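
To make the join problem mentioned in the abstract concrete, here is a minimal single-process Python sketch of joining two triple patterns that share a variable, arranged as map, reduce, and merge steps. It is an illustration under assumptions and not the paper's algorithms: the sample triples, the predicates `knows` and `worksAt`, and the helpers `map_pattern`, `reduce_by_key`, and `merge` are all hypothetical, and no real distributed framework is involved.

```python
from collections import defaultdict

# Hypothetical sample data: (subject, predicate, object) triples.
TRIPLES = [
    ("alice", "knows", "bob"),
    ("alice", "knows", "carol"),
    ("bob", "worksAt", "acme"),
    ("carol", "worksAt", "initech"),
    ("dave", "worksAt", "acme"),
]


def map_pattern(triples, predicate, key_position):
    """Map: emit (join_key, triple) for triples matching one triple pattern."""
    for triple in triples:
        if triple[1] == predicate:
            yield triple[key_position], triple


def reduce_by_key(pairs):
    """Shuffle and reduce: group the mapped triples by their join key."""
    groups = defaultdict(list)
    for key, triple in pairs:
        groups[key].append(triple)
    return groups


def merge(left, right):
    """Merge: join two keyed outputs on the keys they have in common."""
    rows = []
    for key in sorted(left.keys() & right.keys()):
        for person, _, _ in left[key]:
            for _, _, org in right[key]:
                rows.append({"person": person, "friend": key, "org": org})
    return rows


# Query sketch:
#   SELECT ?person ?friend ?org
#   WHERE { ?person knows ?friend . ?friend worksAt ?org }
# The shared variable ?friend is the object of the first pattern (position 2)
# and the subject of the second pattern (position 0).
left = reduce_by_key(map_pattern(TRIPLES, "knows", key_position=2))
right = reduce_by_key(map_pattern(TRIPLES, "worksAt", key_position=0))
rows = merge(left, right)

for row in rows:
    print(row)
```

On this sample data the sketch produces two bindings, one for each friend of `alice` who has a `worksAt` triple. The point of separating the merge step is that each pattern can be mapped and reduced independently before the two keyed outputs are combined.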

Related resources

Query Performance Appraisal using SPARQL & Map Reduce Technique on Web Semantics

The Semantic Web is an emerging technology which aims at making data across the globe semantically connected. The data is represented as a very simple statement-like construct with a subject, a predicate, and an object. This can be visualized as a graph with the subject and the object as nodes and the predicate as an edge connecting the two nodes. When many statements like these are collected to...

Efficient SPARQL Query Evaluation via Automatic Data Partitioning

The volume of RDF data has increased very fast within the last five years; e.g., the Linked Open Data cloud has grown from 2 billion to 50 billion RDF triples. With its scalability, a cloud computing platform like Hadoop is a good choice for processing queries over large data sets. Previous works on evaluating SPARQL queries with Hadoop mainly focus on reducing the number of joins through c...

Cascading map-side joins over HBase for scalable join processing

One of the major challenges in large-scale data processing with MapReduce is the smart computation of joins. Since Semantic Web datasets published in RDF have increased rapidly over the last few years, scalable join techniques become an important issue for SPARQL query processing as well. In this paper, we introduce the Map-Side Index Nested Loop Join (MAPSIN join) which combines scalable index...

Federated SPARQL Query Processing Via CostFed

Efficient source selection and optimized query plan generation are among the most important optimization steps in federated query processing. This paper presents a demo of CostFed, an index-assisted federation engine for federated SPARQL query processing. CostFed’s source selection and query planning are based on the index generated from the SPARQL endpoints. The key innovation behind CostFed is...

RP-Filter: A Path-Based Triple Filtering Method for Efficient SPARQL Query Processing

With the rapid increase of RDF data, SPARQL query processing has received much attention. Currently, most RDF databases store RDF data in a relational table called a triple table and carry out several join operations on the triple tables for SPARQL query processing. However, execution plans with many joins might be inefficient due to a large amount of intermediate data being passed betwee...

Publication date: 2011
